Skip to content

PoC: Add experimental masm-lint CI and fix core MASM findings#3052

Draft
huitseeker wants to merge 5 commits intonextfrom
huitseeker/masm-lint
Draft

PoC: Add experimental masm-lint CI and fix core MASM findings#3052
huitseeker wants to merge 5 commits intonextfrom
huitseeker/masm-lint

Conversation

@huitseeker
Copy link
Copy Markdown
Collaborator

@huitseeker huitseeker commented Apr 24, 2026

This adds a proof-of-concept masm-lint job for the core MASM library. This is not ready for merge: CI checks out the current main commits of trailofbits/masm-lsp and trailofbits/masm-decompiler, then applies the local masm-lsp patch until the upstream fixes land (trailofbits/masm-decompiler#5 and trailofbits/masm-lsp#3).

The decompiler matters because MASM is a stack machine, while the linter needs a value-based view of the code. masm-decompiler lifts stack code into a structured SSA-like IR, infers procedure signatures, and builds type summaries. masm-lint uses that output through masm-analysis. Its advice pass runs a small fixpoint loop: it applies a transfer step, joins the new state into the current one, and repeats until the state stops changing or hits a hard cutoff.

The findings in this PR were small but real (see prior #3021 for issues found using the same tools, but fixed before this PR):

  • u256::eqz had the wrong declared signature.
  • word::testz and word::test_eq now delegate to the canonical word equality helpers.
  • sys::drop_stack_top now declares the full stack shape it consumes.
  • fri::preprocess now asserts that the query and remainder counters are u32 values, and the new tests cover those failures.
  • sorted_array now proves its memory pointers and pointer arithmetic are u32 before they reach memory and alignment-sensitive ops.

@huitseeker huitseeker changed the title PoC: Integrate patched masm-lint on next PoC: Add experimental masm-lint CI and fix core MASM findings Apr 24, 2026
Comment thread crates/lib/core/asm/word.masm Outdated
Comment thread crates/lib/core/asm/word.masm Outdated
@huitseeker huitseeker force-pushed the huitseeker/masm-lint branch from 39618fc to c2734f5 Compare April 24, 2026 15:39
@bitwalker
Copy link
Copy Markdown
Collaborator

What is needed to use our own miden-lsp for this? Just integrating the decompiler?

You mentioned that it is inferring type signatures? Static analysis can obviously check signatures to a limited extent, but inferring signatures would be all but useless due to over-approximation of types. I suspect that it's just computing stack effects (i.e. the number of felts that must have been on the operand stack on entry to be able to even do abstract interpretation at all, and how many remain on exit, and in both cases the values can be constrained to u32 based on instruction context), which our LSP also does - but relying on stack effect analysis for type information leads to a lot of spurious diagnostics once you start doing type checking for real. You basically want to separate the two, and refuse to type check things without type signatures being annotated (and lint on missing signatures) - that way you avoid emitting false positive diagnostics.

Anyway, that doesn't really have any bearing on this PR per se - but I do want this functionality to be provided via our own tooling that we're already working on, rather than splitting our efforts. I'll spend some time today integrating the TOB decompiler (or leveraging the analysis we already have in midenc), and follow up with my findings.

@huitseeker
Copy link
Copy Markdown
Collaborator Author

@bitwalker I was too coarse in the description. The interesting part I meant to describe is that masm-decompiler does infer stack signatures: input felt count, output felt count, net stack effect, and public_inputs. That is not a rich type signature, but it is required so lifting can know how many stack values exist at procedure entry and how many return values to name.

masm-decompiler also has a posterior, separate best-effort type analysis over the lifted IR, which is indeed primitive, and not something used (or emitted) here at all. Further, we're not using the lsp implementation from masm-lsp at all — it just happens to be colocated in the same repo (to which it gives its name) as masm-analysis. This analysis is a basic abstract interpretation loop over the IR, made easy to implement because of its value-oriented SSA form. masm-lint is a small CLI on top of it.

I have plans for how we can move this code somewhere maintained, better discussed elsewhere.

@huitseeker huitseeker force-pushed the huitseeker/masm-lint branch from 757d29c to 32c4c0d Compare April 26, 2026 10:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants